Parallel Sorting on a Shared-Nothing Architecture using Probabilistic Splitting
Authors
Abstract
We consider the problem of external sorting in a shared-nothing multiprocessor. A critical step in the algorithms we consider is to determine the range of sort keys to be handled by each processor. We consider two techniques for determining these ranges of sort keys: exact splitting, using a parallel version of the algorithm proposed by Iyer, Ricard, and Varman; and probabilistic splitting, which uses sampling to estimate quantiles. We present analytic results showing that probabilistic splitting performs better than exact splitting. Finally, we present experimental results from an implementation of sorting via probabilistic splitting in the Gamma parallel database machine.
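The probabilistic-splitting idea described in the abstract can be illustrated with a minimal in-memory sketch. It is not the paper's algorithm or the Gamma implementation: the function names, the `sample_size` parameter, and the use of Python lists to stand in for per-processor partitions are all assumptions made for illustration. Each partition contributes a random sample, the sorted sample's quantiles serve as estimated splitters, and every key is routed to the range its value falls in before each range is sorted locally.

```python
import bisect
import random


def choose_splitters(partitions, num_ranges, sample_size, rng):
    """Estimate num_ranges - 1 splitter keys from a random sample of each partition."""
    sample = []
    for part in partitions:
        k = min(sample_size, len(part))
        sample.extend(rng.sample(part, k))
    if not sample:
        return []  # no data at all: every key would fall in range 0
    sample.sort()
    # Use the sample's i/num_ranges quantiles as estimates of the global quantiles.
    return [sample[(i * len(sample)) // num_ranges] for i in range(1, num_ranges)]


def probabilistic_split_sort(partitions, sample_size=32, seed=0):
    """Return one sorted list per (hypothetical) processor; concatenated, they are sorted."""
    rng = random.Random(seed)
    num_ranges = len(partitions)
    splitters = choose_splitters(partitions, num_ranges, sample_size, rng)
    buckets = [[] for _ in range(num_ranges)]
    for part in partitions:
        for key in part:
            # bisect_right picks the index of the key range that contains this key.
            buckets[bisect.bisect_right(splitters, key)].append(key)
    # Each "processor" sorts only its own key range.
    return [sorted(bucket) for bucket in buckets]
```

Because the splitters are only estimates, the resulting ranges may differ in size; a larger `sample_size` would be expected to make them more even at the cost of more sampling work. An exact-splitting scheme would instead compute the true quantiles so that every range has the same size.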
Similar resources
Parallel Sorting on a Shared-Nothing Architecture using Probabilistic Splitting
Sorting large datasets is often limited by I/O bandwidth in terms of memory and disk. The traditional von Neumann architecture results in high cache misses within L1-L3 cache levels. The authors realized that the highly parallelized processors and the fast memory interconnects inside commodity GPUs can help to work around the limitations that arise when sorting is done solely on the CPU. Specif...
A parallel sort-balance mutual range-join algorithm on hypercube computers
This paper presents an efficient parallel algorithm for computing the mutual range-join of N sets of numbers on shared-nothing hypercube computers. The algorithm iteratively joins each set to the mutual range-join of the preceding sets. Each join is performed on all processors of the hypercube in parallel. The algorithm uses a global sorting method to distribute the elements of the first set evenly...
Performance Evaluation of a Two-Level Hierarchical Parallel Database System
Two typical architectures of parallel database systems are the shared-everything and shared-nothing architectures. Shared-everything architecture provides better performance than the shared-nothing architecture but it is not scalable to large system sizes. On the other hand, shared-nothing architecture provides good system scalability but is sensitive to data skew. Hierarchical architectures ha...
Stack splitting: A technique for efficient exploitation of search parallelism on share-nothing platforms
We study the problem of exploiting parallelism from search-based AI systems on share-nothing platforms, i.e., platforms where different machines do not have access to any form of shared memory. We propose a novel environment representation technique, called stack-splitting, which is a modification of the well-known stack-copying technique, that enables the efficient exploitation of or-paralleli...
A Scalable Parallel Sorting Algorithm Using Exact Splitting
Sorting is one of the most fundamental algorithmic kernels, used by a large fraction of computer applications. This paper proposes a novel parallel sorting algorithm based on exact splitting that combines excellent scaling behavior with universal applicability. In contrast to many existing parallel sorting algorithms that make limiting assumptions regarding the input problem or the underlying c...